Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances



Kolmogorov complexity has inspired several alignment-free distance measures, based on comparing the lengths of compressions, which have been applied successfully in many areas. One of these measures, the so-called Universal Similarity Metric (USM), was used by Krasnogor and Pelta to compare simple protein contact maps, showing that it yielded good clustering on four small datasets. We report an extensive test of this metric using a much larger and more representative protein dataset: the domain dataset used by Sierk and Pearson to evaluate seven protein structure comparison methods and two protein sequence comparison methods. One result is that, when using these simple contact maps, the Krasnogor-Pelta method has less domain discriminant power than any of the methods considered by Sierk and Pearson. In another test, we found that the USM-based distance has low agreement with the CATH tree structure for the same Sierk and Pearson benchmark. In any case, its agreement is lower than that of a standard sequence alignment method, SSEARCH. Finally, we manually found many small subsets of the database that are clustered better by SSEARCH than by USM, to confirm that Krasnogor and Pelta's conclusions were based on datasets that were too small.
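
To make the idea of a compression-based distance concrete, here is a minimal sketch of the normalized compression distance (NCD), a common computable stand-in for the USM in which Kolmogorov complexity is replaced by compressed length. It is an illustration only, not the pipeline evaluated in the paper: the choice of `zlib` as compressor, the function names `compressed_len` and `ncd`, and the toy byte strings are all assumptions, and real contact maps would first have to be serialized to bytes in some way the abstract does not specify.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a crude proxy for K(data))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed length. Values near 0 suggest similar
    inputs and values near 1 suggest unrelated inputs.
    """
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Toy inputs (assumed for illustration): a periodic string and pseudo-random bytes.
    rng = random.Random(0)
    periodic = b"0110100110010110" * 64
    noise = bytes(rng.randrange(256) for _ in range(1024))
    print("periodic vs itself:", ncd(periodic, periodic))
    print("periodic vs noise: ", ncd(periodic, noise))
```

With a real compressor the distance of an object to itself is small but not exactly zero, and results depend on the compressor and on how the objects are encoded, which is one reason such distances need to be checked against a reference classification like CATH.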


